[docs] Improved inpaint docs #5210
Conversation
Nice start, left some comments
Running diffusion models is computationally expensive and intensive, but with a few optimization tricks, it is entirely possible to run them on consumer and free-tier GPUs. For example, you can use a more memory-efficient form of attention such as PyTorch 2.0's [scaled dot-product attention](../optimization/torch2.0#scaled-dot-product-attention) or [xFormers](../optimization/xformers) (you can use one or the other, but there's no need to use both). You can also offload the model to the GPU while the other pipeline components wait on the CPU.

```diff
+ pipeline.enable_model_cpu_offload()
+ pipeline.enable_xformers_memory_efficient_attention()
```
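As a minimal sketch of how these two calls fit into an inpainting pipeline setup (the checkpoint name here is illustrative; substitute any inpainting checkpoint):

```py
# Minimal sketch: load an inpainting pipeline in half precision and
# enable both optimizations. The checkpoint name is illustrative.
import torch
from diffusers import AutoPipelineForInpainting

pipeline = AutoPipelineForInpainting.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
)
pipeline.enable_model_cpu_offload()
pipeline.enable_xformers_memory_efficient_attention()
```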
With [`torch.compile`](../optimization/torch2.0#torch.compile), you can boost your inference speed even more by wrapping your UNet with it.
I feel like we're using this block of text across many guides now. How can we focus more on the objectives of the guides themselves and redirect users elsewhere for the optimization-related details instead?
(Doesn't need to be addressed in this PR, and I'm open to brainstorming.)
## Optimize

It can be difficult and slow to run diffusion models if you're resource-constrained, but it doesn't have to be with a few optimization tricks. One of the biggest (and easiest) optimizations you can enable is switching to memory-efficient attention. If you're using PyTorch 2.0, [scaled dot-product attention](../optimization/torch2.0#scaled-dot-product-attention) is automatically enabled and you don't need to do anything else. For non-PyTorch 2.0 users, you can install and use [xFormers](../optimization/xformers)'s implementation of memory-efficient attention. Both options reduce memory usage and accelerate inference.
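To make the PyTorch 2.0 path concrete, here is a minimal sketch; note that on PyTorch 2.0+ this processor is already the default, so the explicit call is normally unnecessary:

```py
# AttnProcessor2_0 routes attention through PyTorch 2.0's scaled
# dot-product attention; it is already the default on torch >= 2.0.
from diffusers.models.attention_processor import AttnProcessor2_0

pipeline.unet.set_attn_processor(AttnProcessor2_0())  # assumes `pipeline` from the guide
```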
You can also offload the model to the CPU to save even more memory:

```diff
+ pipeline.enable_xformers_memory_efficient_attention()
+ pipeline.enable_model_cpu_offload()
```
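One behavior worth calling out: once model CPU offload is enabled, you shouldn't also move the pipeline to CUDA yourself. A minimal sketch, assuming the `pipeline`, `init_image`, and `mask_image` variables from earlier in the guide (the prompt is illustrative):

```py
# With enable_model_cpu_offload(), do NOT call pipeline.to("cuda");
# each component is moved to the GPU only while it runs.
image = pipeline(
    prompt="a fluffy cat sitting on a park bench",  # illustrative prompt
    image=init_image,
    mask_image=mask_image,
).images[0]
```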
To speed up your inference code even more, use [`torch.compile`](../optimization/torch2.0#torch.compile). You should wrap `torch.compile` around the most intensive component in the pipeline, which is typically the UNet:
```py
pipeline.unet = torch.compile(pipeline.unet, mode="reduce-overhead", fullgraph=True)
```
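Keep in mind that `torch.compile` pays a one-time compilation cost on the first call, so the speedup only shows up on subsequent generations. A minimal timing sketch, assuming the same `pipeline`, `init_image`, and `mask_image` as above:

```py
import time

# The first call triggers compilation and is slow; later calls reuse
# the compiled graph.
for i in range(2):
    start = time.perf_counter()
    image = pipeline(
        prompt="a fluffy cat sitting on a park bench",  # illustrative prompt
        image=init_image,
        mask_image=mask_image,
    ).images[0]
    print(f"call {i}: {time.perf_counter() - start:.1f}s")
```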
Learn more in the [Reduce memory usage](../optimization/memory) and [Torch 2.0](../optimization/torch2.0) guides.
Yeah, after this section I definitely think we should think a bit more about how we can organize and distribute these components that are heavily reused across different docs.
(Doesn't have to be addressed in this PR.)
Thank you for the changes!
Awesome, thank you!
The documentation is not available anymore as the PR was closed or merged.
Nice, that works for me - it would be great, though, to make sure we only change .md files here :-)
* start
* finish draft
* add section
* edits
* feedback
* make fix-copies
* rebase
Part of #4758 to update the inpainting guide to be more extensive and in-depth.